circumventing defense
Reviews: A Little Is Enough: Circumventing Defenses For Distributed Learning
In general, I like the question this paper asks, i.e., whether it is necessary to impose a large deviation on the model parameters in order to attack distributed learning. Most research on Byzantine-tolerant distributed learning, including Krum, Bulyan, and Trimmed Mean, replaces the simple mean at the parameter server (PS) with some statistically "robust aggregation" to mitigate the effect of adversaries. By the nature of robust statistics, all of those methods take a positive answer to the above question for granted, which serves as a cornerstone of their correctness. The fact that this paper gives a negative answer is therefore inspiring and may force researchers to rethink whether robust aggregation alone is enough for Byzantine-tolerant machine learning. However, the authors seem unaware of DRACO (listed below), which is very different from the baselines considered in this paper.
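To make the contrast concrete, here is a minimal NumPy sketch of simple-mean versus coordinate-wise trimmed-mean aggregation at the PS. The trimmed mean below is a simplified stand-in for the robust aggregators the review mentions (Krum, Bulyan, Trimmed Mean), not their exact rules; all names, the trim ratio, and the toy data are illustrative assumptions.

    import numpy as np

    def mean_aggregate(grads):
        # Plain averaging of worker gradients: no Byzantine tolerance at all.
        return np.mean(grads, axis=0)

    def trimmed_mean_aggregate(grads, trim_ratio=0.2):
        # Coordinate-wise trimmed mean: per coordinate, drop the largest and
        # smallest trim_ratio fraction of reported values, then average the rest.
        grads = np.sort(np.asarray(grads), axis=0)   # sort each coordinate across workers
        k = int(len(grads) * trim_ratio)             # how many reports to trim at each end
        trimmed = grads[k:len(grads) - k] if k > 0 else grads
        return np.mean(trimmed, axis=0)

    # Toy example: 10 workers, 5-dimensional gradients, 2 Byzantine workers reporting huge values.
    rng = np.random.default_rng(0)
    honest = rng.normal(0.0, 1.0, size=(8, 5))
    byzantine = np.full((2, 5), 100.0)
    reports = np.vstack([honest, byzantine])
    print("plain mean  :", mean_aggregate(reports))          # dragged far from the honest mean
    print("trimmed mean:", trimmed_mean_aggregate(reports))  # stays close to the honest mean

Large outliers like these are exactly what such rules are built to discard, which is why the question of whether small deviations suffice matters.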
A Little Is Enough: Circumventing Defenses For Distributed Learning
Baruch, Gilad, Baruch, Moran, Goldberg, Yoav
Distributed learning is central for large-scale training of deep-learning models. However, it is exposed to a security threat in which Byzantine participants can interrupt or control the learning process. Previous attack models and their corresponding defenses assume that the rogue participants are (a) omniscient (know the data of all other participants), and (b) introduce large changes to the parameters. We show that small but well-crafted changes are sufficient, leading to a novel non-omniscient attack on distributed learning that goes undetected by all existing defenses. We demonstrate that our attack method works not only for preventing convergence but also for repurposing the model's behavior ("backdooring"). We show that 20% of corrupt workers are sufficient to degrade a CIFAR10 model's accuracy by 50%, as well as to introduce backdoors into MNIST and CIFAR10 models without hurting their accuracy.
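The abstract's key observation, that an attacker can hide inside the empirical variance of the benign gradients, can be sketched as follows. This is a hedged illustration, not the paper's exact algorithm: the function name and the shift factor z are placeholders, and the paper derives a specific bound on z from the numbers of total and corrupted workers that is not reproduced here.

    import numpy as np

    def within_variance_update(own_grads, z=1.0):
        # own_grads: gradients computed from the corrupted workers' own data
        # (non-omniscient: no access to other participants' data).
        own_grads = np.asarray(own_grads)
        mu = own_grads.mean(axis=0)    # estimate of the benign mean
        sigma = own_grads.std(axis=0)  # estimate of the benign per-coordinate std
        # Shift every coordinate by z standard deviations: small enough to look
        # like an inlier to distance- and median-based defenses, consistent
        # enough across the corrupted workers to bias the aggregate.
        return mu + z * sigma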
Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples
Athalye, Anish, Carlini, Nicholas, Wagner, David
We identify obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples. While defenses that cause obfuscated gradients appear to defeat iterative optimization-based attacks, we find defenses relying on this effect can be circumvented. For each of the three types of obfuscated gradients we discover, we describe characteristic behaviors of defenses exhibiting this effect and develop attack techniques to overcome it. In a case study, examining non-certified white-box-secure defenses at ICLR 2018, we find obfuscated gradients are a common occurrence, with 7 of 8 defenses relying on obfuscated gradients. Our new attacks successfully circumvent 6 completely and 1 partially.
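One of the circumvention techniques associated with this line of work is Backward Pass Differentiable Approximation (BPDA): run the non-differentiable defense preprocessing on the forward pass and approximate it by the identity on the backward pass. The PyTorch sketch below is an illustrative rendering of that idea under stated assumptions, not the paper's reference implementation; model, defense_preprocess, and step_size are placeholders.

    import torch
    import torch.nn.functional as F

    def bpda_step(model, defense_preprocess, x, y, step_size=0.01):
        # Forward pass goes through the (possibly non-differentiable) defense
        # preprocessing; the backward pass treats it as the identity, the
        # straight-through trick that applies when defense_preprocess(x) ~= x.
        x_def = defense_preprocess(x)                      # run outside autograd
        x_adv = x.clone().detach().requires_grad_(True)
        x_in = x_def.detach() + (x_adv - x_adv.detach())   # value of x_def, gradient path of x_adv
        loss = F.cross_entropy(model(x_in), y)
        loss.backward()
        # One ascent step on the loss, i.e. one PGD-style attack iteration.
        return (x + step_size * x_adv.grad.sign()).detach()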